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ABSTRACT 



Given a replicated database, a divergent design tunes the indexes 
in each replica differently in order to specialize it for a specific 
subset of the workload. This specialization brings significant per- 
formance gains compared to the common practice of having the 
same indexes in all replicas, but requires the development of new 
tuning tools for database administrators. In this paper we intro- 
duce RITA (Replication-aware Index Tuning Advisor), a novel 
divergent-tuning advisor that offers several essential features not 
found in existing tools: it generates robust divergent designs that 
allow the system to adapt gracefully to replica failures; it com- 
putes designs that spread the load evenly among specialized repli- 
cas, both during normal operation and when replicas fail; it mon- 
itors the workload online in order to detect changes that require a 
recomputation of the divergent design; and, it offers suggestions to 
elastically reconfigure the system (by adding/removing replicas or 
adding/dropping indexes) to respond to workload changes. The key 
technical innovation behind RITA is showing that the problem of 
selecting an optimal design can be formulated as a Binary Integer 
Program (BIP). The BIP has a relatively small number of variables, 
which makes it feasible to solve it efficiently using any off-the- 
shelf linear-optimization software. Experimental results demon- 
strate that RITA computes better divergent designs compared to 
existing tools, offers more features, and has fast execution times. 

1. INTRODUCTION 

Database replication is used heavily in distributed systems and 
database-as-a-service platforms (e.g., Amazon's Relational Database 
Service |T) or Microsoft SQL Azure [2|), to increase availabil- 
ity and to improve performance through parallel processing. The 
database is typically replicated across several nodes, and replicas 
are kept synchronized (eagerly or lazily) when updates occur so 
that incoming queries can be evaluated on any replica. 

Divergent designs |5| represent a new paradigm to tune work- 
load performance over a replicated database. A divergent design 
leverages replication as follows: it specializes each replica to a spe- 
cific subset of the workload by installing indexes that are particu- 
larly beneficial for the corresponding workload statements. Thus, 
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queries can be evaluated more efficiently by being routed to a spe- 
cialized replica. As shown in a previous study, a divergent design 
brings significant performance improvements when compared to a 
uniform design that uses the same indexes in all replicas: queries 
are executed faster due to replica specialization (up to 2x improve- 
ment on standard benchmarks), but updates as well become sig- 
nificantly more efficient (more than 2x improvement) since fewer 
indexes need to be installed per replica. 

To reap the benefits of divergent designs in practice, DB admin- 
istrators need new index-tuning advisors that are replication-aware. 
The original study |5 1 introduces an advisor called DlVGDESIGN, 
which creates specialized designs per replica but has severe limita- 
tions that restrict its usefulness in practice. Firstly, DlVGDESIGN 
assumes that replicas are always operational. Replica failures, how- 
ever, are common in real systems, and the resulting workload redis- 
tribution may cause queries to be routed to low-performing repli- 
cas, with predictably negative effects on the overall system per- 
formance. An effective advisor should generate robust divergent 
designs that allow the system to adapt gracefully to replica failures. 
Secondly, DlVGDESIGN ignores the effect of specialization to each 
replica's load, and can therefore incur a skewed load distribution in 
the system. Our experiments suggest that DlVGDESIGN can cause 
certain replicas to be twice as loaded as others. A good advisor 
should take the replica load into account, and generate divergent 
designs that provide the benefits of specialization while maintain- 
ing a balanced load distribution. Lastly, DlVGDESIGN essentially 
targets a static system where the database workload and the num- 
ber of replicas are assumed to remain unchanged. A replicated 
database system, however, is typically quite volatile: the work- 
load may change over time, and in response the DBA may wish 
to elastically reconfigure the system by expanding or shrinking the 
set of replicas and by incrementally adding or dropping indexes 
at different replicas. A replication-aware advisor should alert the 
DBA when a workload change necessitates to retune the divergent 
design, and also help the DBA evaluate options for changing the 
design. 

The limitations of DlVGDESIGN stem from the fact that it in- 
ternally employs a conventional index-tuning advisor, e.g., DB2's 
db2advis or the index advisor of MS SQL Server, which is not 
suitable for modeling and solving the aforementioned issues. Mod- 
ifying DlVGDESIGN to address its limitations would require a com- 
plete redesign of the advisor which is far from trivial. Moreover, 
one can question the feasibility of divergent designs in practice: 
can we reap the performance benefits demonstrated in [5 1 when we 
impose constraints on load-balancing and failure adaptation, or is 
there an inherent tradeoff between the ability to specialize replicas 
and these important constraints? Answering this question is key to 
developing practically efficient tools for divergent design tuning. 



Contributions. In this paper, we introduce a novel index advi- 
sor termed RITA (Replication-aware Index Tuning Advisor) that 
provides DBAs with a powerful tool for divergent index tuning. In- 
stead of relying on conventional techniques for index tuning, RITA 
is a new type of index advisor that is designed from the ground 
up for divergent designs. RITA's foundation is a novel reduction 
of the problem of divergent design tuning to Binary Integer Pro- 
gramming (BIP). The BIP formulation allows RITA to employ an 
off-the-shelf linear optimization solver to compute near-optimal de- 
signs that satisfy complex constraints (e.g., even load distribution 
or robustness to failures). Compared to DlVGDESIGN, RITA of- 
fers richer tuning functionality and is able to compute divergent 
designs that result in significantly better performance. 

More concretely, the contributions of our work can be summa- 
rized as follows: 

• (Section [3} To make divergent designs suitable for the charac- 
teristics of real-world systems, we introduce a generalized ver- 
sion of the problem of divergent design tuning that has two im- 
portant features: it takes into account the probability of replica 
failures and their effect on workload performance; and, it allows 
for an expanded class of constraints on the computed divergent 
design and in particular constraints on global system-properties, 
e.g., maintaining an even load distribution. 

• (Section[4} We prove that, under realistic assumptions about the 
underlying system, the generalized tuning problem can be formu- 
lated as a compact Binary Integer Program (BIP), i.e., a linear- 
optimization problem with a relatively small number of binary 
variables. The implication is that we can use an off-the-shelf 
solver to efficiently compute a (near-)optimal divergent design 
that also satisfies any given constraints. 

• (Section [5} We propose RITA as a new index-tuning tool that 
leverages the previous theoretical result to implement a unique set 
of features. RITA allows the DBA to initially tune the divergent 
design of the system using a training workload. Subsequently, 
RITA continuously analyzes the incoming workload and alerts 
the DBA if a retuning of the divergent design could lead to sub- 
stantial performance improvements. The DBA can then exam- 
ine how to elastically adapt the divergent design to the changed 
workload, e.g., by expanding/shrinking the set of replicas, in- 
crementally adding/removing indexes, or changing how queries 
are distributed across replicas. Internally, RITA translates the 
DBA's requests to BIPs that are solved efficiently by a linear- 
optimization solver. In fact, RITA often returns its answers in 
seconds, thus facilitating an exploratory approach to index tun- 
ing. 

• (Section [6]l We perform an extensive experimental study to val- 
idate the effectiveness of RITA as a tuning advisor. The re- 
sults show that the designs computed by RITA can improve sys- 
tem performance by up to a factor of two compared to the stan- 
dard uniform design that places the same indexes on all repli- 
cas. Moreover, RITA outperforms DlVGDESIGN by up to 35% 
in terms of the performance of the computed divergent designs, 
while supporting a larger class of constraints. 

Overall, RITA provides a positive answer to our previously stated 
question: a divergent design can bring significant performance ben- 
efits while maintaining important properties such as a balanced 
load distribution and tolerance to failures. Consequently, diver- 
gent design advisors can be practically employed on real systems 
and guide further development of tuning tools. The underlying the- 
oretical results (problem definition and BIP formulation) are also 
significant, as they expand on the previous work on single-system 
tuning |6| and demonstrate a wider applicability of Binary Integer 



Programming to index-tuning problems. 

2. RELATED WORK 

Physical configuration tuning. There has been a long line of 
research studies on the problem of tuning the index configuration 
of a single DBMS Bill [HOI)- These methods typically take a 
representative workload W as input, and after some analysis they 
recommend an index configuration that optimizes the evaluation of 
the workload according to the optimizer's estimates. 

A recent work has introduced COPHY [6|, a new index advi- 
sor paradigm, that outperforms state-of-the-art commercial and re- 
search techniques by a significant margin (up to an order of mag- 
nitude) both in solution quality and in total execution time. Both 
RITA and COPHY leverage the same underlying principle of lin- 
ear composability, which we will define and discuss extensively in 
Section l4~Tl in order to cast the index-tuning problem as a compact, 
efficiently-solvable Binary Integer Program (BIP). However, CO- 
PHY targets the conventional index-tuning problem where the goal 
is to compute a single index configuration for a single-node sys- 
tem. This problem scenario is much simpler than what we consider 
in our work, where there are several nodes in the system, each can 
carry a different index configuration, queries have to be distributed 
in a balanced fashion and the system must recover gracefully from 
failures. Leveraging the principle of linear composability in this 
generalized problem scenario is one of the key contributions of our 
work. 

Shinobi |17] is a system that utilizes workload information to 
partition the data and selectively index data within each partition. 
This results in less expensive index maintenance and reorganization 
costs, by creating and dropping indexes on subsets of the data (the 
workload-based partitions) as the access patterns change. However, 
Shinobi does not address replication of partitions or different index 
configurations on replicas of the same partition, which is the prob- 
lem that we examine in our work. In fact, our techniques can be 
used to determine which indexes to install on each replica, and then 
Shinobi can be responsible for maintaining only the fragments of 
these indexes that are important for the current workload patterns. 

Physical data organization on replicas. Previous works also 
considered the idea of diverging the physical organization of repli- 
cated data. The technique of Fractured Mirrors [12] builds a mir- 
rored database that stores its base data in a different physical orga- 
nizations on disk (specifically, in a row-based and a column-based 
organization). To take advantage of this storage model, the query 
processor is modified to run query execution plans that can work on 
both formats of the data. Similarly, Distorted Mirrors [14 presents 
logically but not physically identical mirror disks for replicated 
data. Neither of these explores different index organization for each 
mirror. 

Likewise, TROJAN HDFS |9] organizes data blocks of Hadoop 
Distributed File System (HDFS) into attribute groups according the 
workload in order to improve data access times; each data block 
replica might have different attribute groups. However, this work 
does not handle load balancing and node-failures as our work does. 
A recent work, HAIL [8], keeps the existing physical replicas of 
an HDFS block in different sort orders and with different clustered 
indexes. As this system targets for applications that analyze web- 
log data sets that have a few attributes, it is feasible to create a few 
indexes (e.g, three indexes) on these attributes. This work cannot 
create indexes for data sets that have large number of attributes (i.e., 
the data set contains more attributes than the number of replicas). 
In contrast, our work handles a very large set of indexes (e.g, in the 
order of hundreds). 



The previous work in [5] introduced a divergent design advisor, 
termed DlVGDESIGN, that builds on the functionality of an exist- 
ing single-system index advisor in order to compute a divergent de- 
sign. DlVGDESIGN is fundamentally limited by the functionality 
of the underlying single-system advisor, and cannot support many 
essential tuning functionalities as RITA that have been discussed 
in Section[T] 

3. DIVERGENT DESIGN TUNING: PROB- 
LEM STATEMENT 

In this section, we formalize the problem of divergent design 
tuning. The problem statement borrows several concepts from the 
original problem statement in |5| but also provides a non-trivial 
generalization. A comparison to the original study appears at the 
end of the section. 

3.1 Basic Definitions 

We consider a database comprising tables T\, . . . ,T n , An index 
configuration X is a set of indexes defined over the database tables. 
We assume that X is a subset of a universe of candidate indexes 
,y = 5?\ U ■ • ■ U y n , where S^i represents the set of candidate in- 
dexes on table 7). Each ,5^i represents a very large set of indexes and 
can be derived manually by the DBA or by mining the query logs. 
We do not place any limitations on the indexes regarding their type 
or the type or count of attributes that they cover, except that each 
index in X is defined on exactly one table (i.e., no join indexes). 

We use cost(q,X) to denote the cost of evaluating query q as- 
suming that X is materialized. The cost function can be evalu- 
ated efficiently in modern systems (i.e., without materializing X) 
using a what-if optimizer |4|. We define cost(u,X) similarly for 
an update statement u, except that in this case we also consider 
the overhead of maintaining the indexes in X due to the update. 
Following common practice 1131 16), we break the execution of u 
into two orthogonal components: (1) a query shell q se [ that selects 
the tuples to be updated, and (2) an update shell that performs the 
actual update on base tables and also updates any affected mate- 
rialized indexes. Hence, the total cost of an update statement can 
be expressed as cost(u,X) = cost(q se i,X) + Y2 a ex ucost(u,a)+c u , 
where ucost(u,a) is the cost to update index a with the effects of 
the update and can be estimated again using the what-if optimizer. 
The constant c u is simply the cost to update the base data which 
does not depend onX 

We consider a database that is fully replicated in N nodes, i.e., 
each node i <E [1,N] holds a full copy of the database. The replicas 
are kept synchronized by forwarding each database update to all 
replicas (lazily or eagerly). At the same time, a query can be evalu- 
ated by any replica. Since we are dealing with a multi-node system, 
we have to take into account the possibility of replicas failing. We 
use a to denote the probability of at least one replica failing. Set- 
ting this parameter can be done once in the beginning to the best 
of the DBA's ability and then it can be updated with easy statis- 
tics as the system is used (you adjust it based on the failure rate 
you see). To simplify further notation, we will assume that at most 
one replica can fail at any point in time. The extension to multiple 
replicas failing together is straightforward for our problem. 

We define W = Q U U as a workload comprising a set Q of query 
statements and a set U of update statements. Workload W serves as 
the representative workload for tuning the system. As is typical in 
these cases, we also define a weight function / : W — > 9? such that 
f(x) corresponds to the importance of query or update statement 
x in W. The input workload and associated weights can be hand- 
crafted by the DBA or they can be obtained automatically, e.g., by 



analyzing the query logs of the database system. 

3.2 Problem Statement 

At a high level, a divergent design allows each replica to have 
a different index configuration, tailored to a particular subset of 
the workload. To evaluate the query workload Q, an ideal strategy 
would route each q <E Q to the replica that minimizes the execu- 
tion cost for q. However, this ideal routing may not be feasible for 
several reasons, e.g., the replica may not be reachable or may be 
overloaded. Hence, the idea is to have several low-cost replicas 
for q, so as to provide some flexibility for query evaluation. For 
this purpose, we introduce a parameter m 6 [l,N], which we term 
routing multiplicity factor. Informally, for every query q £ Q, a di- 
vergent design specifies a set of m low-cost replicas that q can be 
routed to. The value of m is assumed to be set by the administrator 
who is responsible for tuning the system: m = 1 leads to a design 
that favors specialization; m = N provides for maximum flexibility; 
1 < m < N achieves some trade-off between the two extremes. 

Formally, we define a divergent design as a pair (I,h). The 
first component I = (/; , . . . ,1^) is an iV-tuple, where l r is the in- 
dex configuration of replica r e [1,JV]. The second component 
h= {liQ,h\,-- ■ ,hff) is a (N+ l)-tuple of routing functions. Specif- 
ically, ho() is a function over queries such that ho(q) specifies the 
set of m replicas to which q can be routed when all replicas are op- 
erational (i.e., there are no failures). Intuitively, ho(q) indicates the 
replicas that can evaluate q at low cost while respecting other con- 
straints (e.g., bounding load skew among replicas, which we discus 
later), and is meant to serve as a hint to the runtime query sched- 
uler. Therefore, a key requirement is that /io() can be evaluated 
on any query q and not just the queries in the training workload. 
The remaining functions h\,...,h^ have a similar functionality but 
cover the case when replicas fail: hjQ, for j S [ 1 , iV] , specifies how 
to route each query when replica j has failed and is not reachable. 
Notice that in this case there may be fewer than m replicas in hj(q) 
for any q e Q if the DBA has originally specified m = N. 

In order to quantify the goodness of a divergent design, we first 
use a metric that captures the performance of the workload under 
the normal operation when no running replica fails as follows. 



TotalCost(l,h) = Y^ Y^ J —^-cost(qJ r ) + 
q£Qr£h {q) 

5Z 5Z f{u)cost(u,Ii) 

ueUie\i,N\ 

The second term simply captures the cost to propagate each up- 
date u £ U to each replica in the system. The first summation cap- 
tures the cost to evaluate the query workload Q. We assume that 
q is routed uniformly among its m replicas in ho(q), and hence the 
weight of q is scaled by i/m for each replica. The intuition behind 
the TotalCost(l,h) metric is that it captures the ability of the diver- 
gent design to achieve both replica specialization and flexibility in 
load balancing with respect to m. 

To capture the case of failures, we define FTotalCost(l,h, j) as 
the performance of the workload when replica j <E [1,N] fails: 



FTotalCost(I,h,j) = Y^ V] 



/(<?) 



max{m,N — 1} 



cost(q,I r ) + 



qeQrehj(q) 

y"l 5Z f{u)cost{u,Ii) 

ueUie{l,-,N}-{j} 

The expression for FTotalCost (I,h, j) is similar to TotalCost(I,h), 
except that, since replica j is unavailable, the update cost on replica 
j is discarded and routing function hj is used instead of Iiq. 



We quantify the goodness of a divergent design (I, h) based on 
the expected cost of the workload, denoted as ExpTotalCost(l,h), 
by combining TotalCost(I,h) and FTotalCost(l,h, j) weighted ap- 
propriately. Recall that a is a DBA-specified probability that a fail- 
ure will occur. It follows that (1 —a) is the probability that all 
replicas are operational and hence the performance of the work- 
load is computed by TotalCost(l,h). Conversely, the probability of 
a specific replica j failing is a/N, assuming that all replicas can 
fail independently with the same probability. In that case, the cost 
of workload evaluation is FTotalCost(l,h,j). Putting everything 
together, we obtain the following definition for the expected work- 
load cost: 



ExpTotalCost(l, h) 



(l-a)-TotalCost(I,h) + 
J2 ^FTotalCost{I,hJ) 



Ml,N] 



Our assumption so far is that at most one replica can be inoper- 
ational at any point in time. The extension to concurrent failures 
is straightforward. All that is needed is extending h with routing 
functions for combinations of failed replicas, and then extending 
the expression of ExpTotalCost(l, h) with the corresponding cost 
terms and associated probabilities. 

We are now ready to formally define the problem of Divergent 
Design Tuning, referred to as DDT. 

PROBLEM 1. (Divergent Design Tuning - DDT) We are given 
a replicated database with N replicas, a workload W = Q U U, a 
candidate index-set 5? ', a set of constraints C, a routing multiplicity 
factor m, and a probability of failure a. The goal is to compute a 
divergent design (I, h) that employs indexes in ,5^ , satisfies the con- 
straints in C, and ExpTotalCost(l,h) is minimal among all feasible 
divergent designs. □ 

Constraints in DDT. The set of constraints C enables the DBA to 
control the space of divergent designs considered by the advisor. 
An intra-replica constraint specifies some desired property that is 
local to a replica. Examples include the following: 

• The size of Ij in I is within a storage-space budget. 

• Indexes in /,■ must have specific properties, e.g., no index can be 
more than 5-columns wide, or the count of multi-key indexes is 
below a limit. 

• The cost to update the indexes in L is below a threshold. 
Conversely, an inter-replica constraint specifies some property that 
involves all the replicas. Examples include the following: 

• If (If ,h c ) represents the current divergent design of the system, 
then ExpTotalCost(I,h) must improve on ExpTotalCost(l c ,h c ) 
by at least some percentage. 

• The total cost to materialize (I,h) (i.e., to build each Ij in each 
replica) must be below some threshold. 

• The load skew among replicas must be below some threshold. 
(We discuss this constraint in more detail shortly.) 

We will formalize later the precise class of constraints C that we can 
support in RITA. The goal is to provide support for a large class 
of practical constraints, while retaining the ability to find effective 
designs efficiently. 

Bounding load skew is a particularly important inter-replica con- 
straint that we examine in our work. The replica-specialization im- 
posed by a divergent design means that each replica may receive a 
different subset of the workload, and hence a different load. The 
ExpTotalCostQ metric does not take into account these different 
loads, which means that minimizing workload cost may actually 
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lead to a high skew in terms of load distribution. Our experiments 
verify this conjecture, showing that an optimal divergent design in 
terms of ExpTotalCostQ can cause loads at different replicas to 
differ by up to a factor of two. This situation, which is clearly 
detrimental for good performance in a distributed setting, can be 
avoided by including in C a constraint on the load skew among 
replicas. More concretely, the load of replica j under normal oper- 
ation can be computed as: 

load(l,hJ)= V M 
*■ — ' m 

qeQA j€h (q) 

We say that design (I, h) has load skew r > if and only if load(I, h, r) < 
( 1 + t ) ■ load(l, h, j) for any 1 < r ^ j < N. A low value is desirable 
for r, as it implies that (I, h) keeps the different replicas relatively 
balanced. 

We can define a load-skew constraint for the case of failures in 
exactly the same way. Specifically, we define flood (I,h,j,f) as the 
load of replica j when replica / fails. The formula offload(l, h, j,f) 
is similar to that of load(I,h,j) except that ho is replaced by hf. 
The constraint then specifies thaifload(l, j, f) < (1 + T r )fload{\, h,r,f) 
for any valid choice of j,r,f and a skew factor r' > 0. 

It is straightforward to verify that zero skew is always possible 
by assigning the same index configuration to each replica. One may 
ask whether there is a tradeoff between specialization (and hence 
overall performance) and a low skew factor. One of the contri- 
butions of our work is to show that this is not the case, i.e., it is 
possible to compute divergent designs that exhibit both good per- 
formance and a low skew factor. 

Theoretical Analysis. Computing the optimal divergent design 
implies computing a partitioning of the workload to replicas and 
an optimal index configuration per replica. Not surprisingly, the 
problem is computationally hard, as formalized in the following 
theorem. The proof is provided in Appendix lAl 

THEOREM 1. It is not possible to compute an optimal solution 
to DDT in polynomial time unless P = NP. 

3.3 Comparison to Original Study (H 

The formulation of DDT expands on the original problem state- 
ment in [5] in several non-trivial ways. First, DDT incorporates the 
expected cost under the case of failures into the objective function, 
whereas failures were completely ignored in [5 ]. Second, our for- 
mulation allows a much richer set of constraints C compared to the 
original study which considered solely intra-replica constraints. As 
discussed earlier, the omission of such constraints may lead to di- 
vergent designs with undesirable effects on the overall system, e.g., 
the load skew issue that we discussed earlier. Finally, the original 
problem statement imposed a restriction for ho(q) to correspond to 
the m replicas with the least evaluation cost for q, that is, Vq 6 Q 
and V/,7 £ [1, AT] such that i £ ho(q) and j ^ ho(q) it must be that 
cost(q,Ij) < cost(q,I j). We remove this restriction in our formula- 
tion in order to explore a larger space of divergent designs, which 
is particularly important in light of the richer class of constraints 
that we consider. 

4. DIVERGENT DESIGN TUNING AS BI- 
NARY INTEGER PROGRAMMING 

In this section, we show that the problem of Divergent Design 
Tuning (DDT) can be reduced to a Binary Integer Program (B1P) 
that contains a relatively small number of variables. The implica- 
tion is that we can leverage several decades of research in linear- 
optimization solvers in order to efficiently compute near-optimal 



divergent designs. Reliance on these off-the-shelf solvers brings 
other important benefits as well, e.g., simpler implementation and 
higher portability of the index advisor, or the ability to operate in 
"any-time" mode where the DBA can interrupt the tuning session at 
any point in time and obtain the best design computed thus far. We 
discuss these features in more detail in Section|5] when we describe 
the architecture of RITA. 

The remainder of the section presents the technical details of the 
reduction. We first review some basic concepts for fast what-if opti- 
mization, which forms the basis for the development of our results. 
We then present the reduction for a simple variant of DDT and then 
generalize to the full problem statement. 

4.1 Fast What-If Optimization 

What-if optimization is a principled method to estimate cost(q,X) 
and cost(u,X) for any q £ Q, u £ U and index set X, but it remains 
an expensive operation that can easily become the bottleneck in 
any index-tuning tool. To mitigate the high overhead of what-if op- 
timization, recent studies have developed two techniques for fast 
what-if optimization, termed INUM [11] and C-PQO |3| respec- 
tively, that can be used as drop-in replacements for a what-if opti- 
mizer. In what follows, we focus on INUM but note that the same 
principles apply for C-PQO. 

We first introduce some necessary notation. A configuration A C 
y is called atomic 1111 if A contains at most one index from each 
«yj, We represent A as a vector with n elements, where A[i] is an 
index from y or a symbol SCAN; indicating that no index of 5^\ is 
selected. For an arbitrary index set X, we use atom(X) to denote 
the set of atomic configurations in X. To simplify presentation, we 
assume that a query q references a specific table 7) with at most one 
tuple variable. The extension to the general case is straightforward 
at the expense of complicated notation. 

For each query q, INUM makes a few carefully selected calls to 
the what-if optimizer in order to compute a set of template plans, 
denoted as TPlans(q). A template plan p £ TPlans(q) is a physical 
plan for q except that all access methods (i.e., the leaf nodes of the 
plan) are substituted by "slots". Given a template p £ TPlans(q) 
and an atomic index configuration A, we can instantiate a concrete 
physical execution plan by instantiating each slot with the corre- 
sponding index in A, or a sequential scan if A does not prescribe an 
index for the corresponding relation. Figure [T]shows an example of 
this process for a simple query over three tables T\, T 2 , and Tj, and 
an atomic configuration that specifies an index on Ti and another 
index on 73. Each template is also associated with an internal plan 
cost, which is the sum of the costs of the operators in this plan ex- 
cept the access methods. Given an atomic configuration A, the cost 
of the instantiated plan, denoted as cost(p,A), is the sum of the 
internal plan cost and the cost of the instantiated access methods. 

The intuition is that TPlans(q) represents the possibilities for the 
optimal plan of q depending on the set of materialized indexes. 
Hence, given a hypothetical index configuration X, INUM esti- 
mates cost(q,X) as the minimum cost(p,A) over p £ TPlans(q) 
and A £ Atom(X). Note that a slot in p may have restrictions on 
its sorted order, e.g., the template plan in Figure Q~]prescribes that 
the slot for T\ must be accessed in sorted order of attribute x. If A 
does not provide a suitable access method that respects this sorted 
order, then cost(p,A) is set to °o, INUM guarantees that there is 
at least one plan p in TPlans(q) such that cost(p,A) < <*> for any 
A £ Atom(X). As shown in the original study [11], INUM provides 
an accurate approximation for the purpose of index tuning, and is 
orders-of-magnitude faster compared to conventional what-if opti- 
mization. 

Linear composability. The approximation provided by INUM and 



(b) Template Plan 



(c) Instantiated Plan based on 
the atomic configuration A 



SELECT T 2 .x 
FROM T,,T 2 , T 3 
WHERE T,.x = T 3 .yAND 
T,.x = T 2 .z 




Figure 1: Example of template plans and instantiated plans. 
The configuration A has the following contents: A[l] = a, an 

index with key T\ .x; A [2] = SCAN2; A [3] = b, an index with key 
(T 2 jc,T 2 .w) ® 



C-PQO can be formalized in terms of a property that is termed 
linear composability in (6). 

Definition 1 (Linear composability [6]). Function cost () 
is linearly composable for a select-statement q if there exists a 
set of identifiers K q and constants ftp and j pa for p £ K q , a £ 
y U {SCAN 1 } U • ■ • U {SCAN n } such that: 

cost(q,X) = min{j3 p +/]"fpa,p £ K q ,A eAtom(X)} 

aeA 

for any configuration X. Function cost{) is linearly composable 
for an update-statement q if it is linearly composable for its query 
shell. a 

It has been shown in |6| that both INUM and C-PQO compute a 
cost function that is linearly composable. For INUM, K q = TPlans(q) 
and each p corresponds to a distinct template plan in TPlans(q). 
Here, we use TPlans(q) for the set of identifiers and overload p £ 
TPlans(q) to represent an identifier. In turn, the expression j3 p + 
X^7pa corresponds to cost(p,A), where j3 p denotes the internal 
plan cost of p, and j pa is the cost of implementing the correspond- 
ing slot in p using index a. (The slot covers the relation on which 
the index is defined.) Note that linear composability does not im- 
ply a linear cost model for the query optimizer - non-linearities are 
simply hidden inside the constants f3 qp . 

For the remainder of the paper, we assume that cost(q,X) is com- 
puted by either INUM or C-PQO (for the purpose of fast what-if 
optimization) and hence respects linear composability. 

4.2 Basic DDT 

In this subsection, we discuss how to reduce DDT to a compact 
BIP for the case when a = 0, C = (i.e., no failures and no con- 
straints) and the workload comprises solely queries, i.e., W = Q. 
This reduction forms the basis for generalizing to the full problem 
statement, which we discuss later. 

BIP formulation. At a high level, we are given an instance of 
DDT and we wish to construct a BIP whose solution provides an 
optimal divergent design. This reduction will hinge upon the linear 
composability property, i.e., we assume that each query q £ W has 
been preprocessed with INUM and therefore we can approximate 
cost(q,X) for any X C y as expressed in DefinitionQ] 

Figure [2] shows the constructed BIP. (Ignore for now the boxed 
expressions.) In what follows, we will explain the different com- 
ponents of the BIP and also formally state its correctness. The BIP 
uses two sets of binary variables to encode the choice for a diver- 
gent design (I,h): 
• Variable s' a is set to 1 if and only if index a is part of the index 

design I r on replica r. In other words, I r = {a | s r a = 1}. 



Minimize: TotalCost (l,h) = QueryCost(l,h) +UpdateCost(l,h) , 
where: 

QueryCost{l,h) = Y Y cost{q,r) 

qeQre[l,N] 



UpdateCost (I,h) = 
+ 


Y Y f(9)cost(9,r) 

l£Q„l,jr£[l,N] 

Y Y f(uV a -ucost(u,a) 

ueU re[l,N] 



t(q,r)= y Ppy r P+ Y 

p£TPlans(q) p£TPlans(q) 

flG.y'U{SCANj }U —U{SCAN„} 



r Vre[lJV], 
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such that: 
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S a > x' pa , VqeQU Q upd , p e TPlans (q),a€3? 



Y *, 



_ , V q eQVQ upd ,p£TPlans(q), 



(1) 
(2) 

(3) 

(4) 

(5) 
(6) 



pa /p> j6[l,n], T, is referenced in q 
ae^UfSCAN;} 

Figure 2: The BIP for Divergent Design Tuning. 



• Variable f q is set to 1 if and only if query q is routed to replica r, 
i.e., r £ ho(q). (Recall that we ignore failures for now.) In other 
words, ho(q) = {r | t r „ = 1}. 

Under our assumption of using fast what-if optimization, the cost 
of a query q in some replica r can be expressed as cost(q,l r ) = 
cost(p'.A') for some choice of p' £ TPlans(q) and an atomic con- 
figuration A' £ Atom(I r ). To encode these two choices, we intro- 
duce two different sets of binary variables: 

• Variable x r pa , where p is a template in TPlans(q) and a is an index 
in ,yu{SCAN! } U • ■ ■ U {SCAN„}, is equal to 1 if and only if p = 
p' and a £ A'. 

• Variable y„ = 1 if and only if p = //. 

The BIP specifies several constraints that govern the valid value 
assignments to the aforementioned variables: 

• Constraint ([2j specifies that query q must be routed to exactly m 
replicas. 

• Constraint © specifies that there must be exactly one variable 
y„ set to \\if q = 1, i.e., exactly one template p chosen for com- 
puting cost(q,I r ) if q is routed to r. Conversely, y r „ = for all 
templates pift r = 0. 

• Constraint ((5} specifies that an index a can be used in instantiat- 
ing a template p at replica r only if it appears in the corresponding 
design /,-. 

• Constraint (|6} specifies that if y r „ = 1, i.e., p is used to compute 
cost(q,I r ), then there must be exactly one access method a per 
slot such that x r pa = 1 . Essentially, the choices of a for which 
Xp a = 1 must correspond to an atomic configuration. Conversely, 
x r pa = for all a if y r p = 0. 

Given these variables, we can express cost(q,I r ) as in Equation[T] 
in Figure [2] The equation is a restatement of linear composability 
(Definition!]} by translating the minimization to a guarded summa- 



tion using the binary variables y r „ and x' pa . Specifically, \if„ = 1, 
then constraint ((4) forces the solver to pick exactly one p such 
that Vp = 1 , and constraint (|6) forces setting x r pa = 1 for the same 
choice of p and corresponding to an atomic configuration. Hence, 
minimizing the expression in Equation[Tjcorresponds to computing 
cost(q,l r ). Otherwise, if t T „ = 0, then the same constraints force 
cost(q,I r ) = 0. In turn, it follows that the objective function of the 
BIP corresponds to TotalCost(l,h). 

Handling update statements. The total cost to execute update 
statements, UpdateCost (I,h), includes two terms, as shown in the 
second boxed expression in Figure [2] Here, Q upc i denotes the set 
of all the query-shells, each of which corresponds to each update 
statement in U. The first component of UpdateCostQ is the total 
cost to evaluate every query-shell in Q up d at every replica. This 
component is expressed as the summation of cost(q, r) for all q se i £ 
Qupd an d r £ [1,^V] in our BIP. Since each query-shell needs to be 
routed to all replicas, we impose the constraint (|3). 

The second component of UpdateCost () is the total cost to up- 
date the affected indexes. Using variable s' a that tracks the selection 
of an index at replica r in the recommended configuration, the cost 
of updating an index a at replica r given the presence of an update 
statement ;( is computed as the product of s r a and ucost(u, a) . 

Correctness. Up to this point, we argued informally about the 
correctness of the BIP. The following theorem formally states this 
property. The proof is given in Appendix [B] 

THEOREM 2. A solution to the BIP in Figure\2\corresponds to 
the optimal divergent design for DDT when a = and C = 0. 

As stated repeatedly, the key property of the BIP is that it con- 
tains a relatively small number of variables and constraints, which 
means that a BIP-solver is likely to find a good solution efficiently. 
Formally: 

COROLLARY 1 . The number of variables and constraints in the 
BIP shown in Figure\2\is in the order of 0{N\W\\y\). 

In fact, it is possible to eliminate some variables and constraints 
from the BIP while maintaining its correctness. We do not show 
this extension since it does not change the order of magnitude for 
the variable count but it makes the BIP less readable and harder to 
explain. 

4.3 Factoring Failures 

To extend the BIP to the case when a > (i.e., failures are pos- 
sible), we first introduce additional variables tq J , y r p and x r pa , for 
j £ [l,N]. These variables have the same meaning as their coun- 
terparts in Figure [2] except that they refer to the case where replica 
j fails. For instance, tq J = 1 if and only if q is routed to replica r 

when j fails, i.e., hj(q) = {r | t' c j J = 1}. We augment the BIP with 
the corresponding constraints as well. For instance, we add the con- 
straint ~}2 r u;tq J = max{A'— l,m}, Vg SQ,j £ [l,N] to express the 
fact that function hj() must respect the routing-multiplicity factor 
m. Finally, we change the objective function to ExpTotalCostQ, 
which is already linear, and express each term FTotalCost(l,h, j) 
as a summation that involves the new variables. 

The complete details for this extension, including the proof of 
correctness, can be found in Appendix [C] We should mention that 
this extension increases the number of variables and constraints by 
a factor of N to 0(N 2 \W\ \y\), since it becomes necessary to reason 
about the failure of every replica j £ [l,N]. 

4.4 Adding Constraints 



In this subsection, we discuss how to extend the BIP when C 7^ 0, 
i.e., the DBA specifies constraints for the divergent design. 

Obviously, we can attach to the BIP any type of linear constraint. 
As it turns out, linear constraints can capture a surprisingly large 
class of practical constraints. In what follows, we present three 
examples of how to translate common constraints to linear expres- 
sions that be directly added to the BIP. 

Space budget. Let size(a) denote the estimated size of an index 
a, and b be the storage budget at each replica. Using the vari- 
able s r a that tracks the selection of an index at replica r in the rec- 
ommended configuration, the storage constraint can be encoded as: 
^2 a( zy s r a size(a) < b, Vr e [ 1 , JV] . In general, variables s£ can be 
used to express several types of intra-replica constraints that in- 
volve the selected indexes, e.g., bound the total number of multi- 
key indexes per replica, or bound the total update cost for the in- 
dexes in each replica. 

Bounding load-skew. Recall that load(I,h,j) captures the total 
load of replica j under a divergent design (I,h). The load-skew 
constraint specifies that load(l,h,j) < (1 +T)load(l,h,r), for any 
r 7^ j, where r is the load-skew factor provided by the DBA. 

It is straightforward to translate the constraint between two spe- 
cific replicas j and r into a linear inequality, by using variables x r pa 
and y r „ to rewrite the corresponding loadQ terms as linear sums. 
Specifically, load(l, h,j) can be expressed as a linear sum similarly 
to TotcuCostQ in Figure [2] except that we only consider replica j 
and the queries for which j e ho(q), and the same goes for express- 
ing load(l,h,r). 

Based on this translation, we can add N(N — 1 ) constraints to the 
BIP, one for each possible choice of j and r. We can actually do 
better, by observing that we can sort replicas in ascending order of 
their load, and then impose a single load-skew constraint between 
the first and last replica. By virtue of the sorted order, the constraint 
will be satisfied by any other pair of replicas. Specifically, we add 
the following two constraints to the BIP: 

load(I,h,i) < load(I,h,i+l), Vie [l.Af — 1] (7) 

load(I,h,N) < (l+T)-load(l,h,l) (8) 

This approach requires only N constraints and is thus far more ef- 
fective. 

The final step requires adding another set of constraints on cost(q, I r ) 
This is a subtle technical point that concerns the correctness of the 
reduction when the constraints are infeasible. More concretely, the 
solver may assign variables y r and x r for some query q so that 
constraints |[7}h[8} are satisfied even though this assignment does 
not correspond to the optimal cost cost(q,I r ). To avoid this situa- 
tion, we introduce another set of variables that are isomorphic to 
x r „ a and are used to force a cost-optimal selection for y r „ and x r „ a . 
The details are given in Appendix ID. 11 but the upshot is that we 
need to add 0(iV|W| \5?\) additional constraints. 

We have also developed an approximate scheme to handle load- 
skew constraints in the BIP. The approximate scheme allows the 
BIP to be solved considerably faster, but the compromise is that 
the resulting divergent design may not be optimal. However, our 
experimental results (see Section [6} suggest that the loss in quality 
is not substantial. The details of the approximate scheme can be 
found in Appendix ID. 21 

Materialization cost constraint. This constraint specifies that the 
total cost to materialize (I,h) must be below some threshold C, n . 
The materialization cost is computed with respect to the current 
design (I e ,h f ) and takes into account the cost to scale up or down 
the current number of replicas, and the cost to create additional 
indexes or drop redundant indexes in each replica. 



We first consider the case when the number of replicas remains 
unchanged between (I,h) and (I f ,h c ). Let us consider a specific 
replica r and the new design /,• e I. Let l c r e I f denote the previ- 
ous design. Clearly, we need to create every index in /,- — l c r and to 
delete every index in l c r — I r . Assuming that ccost(a) and dcost(a) 
denote the cost to create and drop index a respectively, we can ex- 
press the reconfiguration cost for replica r as ^2 a dje s r a ccost(a) + 
^ flG/r (l — s r a )dcost(a). If each replica can install indexes in paral- 
lel, then the materialization cost constraint can be expressed as: 

y^ s r a ccost(a)+ ^ (I - s r a )dcost(a) < C m ,Vr e [1,JV] 

We can also express a single constraint on the aggregate material- 
ization cost by summing the per-replica costs. 

We next consider the case when the DBA wants to shrink the 
number of replicas to be Nj < N. In this case, the BIP solver should 
try to find which replicas to maintain and how to adjust their index 
configurations so that the total materialization cost remains below 
threshold. For this purpose, we introduce N new binary variables 
f with re [l,N] associated with each replica r, where f = 1 if 
replica r is kept in the new divergent design, and z r = other- 
wise. The materialization cost can be computed in a similar way 
as discussed above, except that we need to add the following two 
additional constraints to the BIP. 



t' q <z r .y q eQ\JQ upd ,re[\,N] 



re[l,iV] 



-N d 



(9a) 
(9b) 



The first constraint ensures that we can route queries only to live 
replicas. The second simply restricts the number of live replicas to 
the desired number. 

Lastly, we consider the case when the DBA wants to expand the 
number of replicas to be N4 > N. The set of constraints in the BIP 
can be re-used except that all the variables are defined according to 
Nj replicas (instead of N replicas as before). The materialization 
cost can also be computed in a similar way. In addition, we also 
take into account the cost to deploy the database in new replicas, 
which appear as constants in the total cost to materialize a design 
in a new replica. 

4.5 Routing Queries 

Recall that a divergent design (I, h) includes both the index-sets 
for different replicas and the routing functions h^Q,h\ (),•■•> AjvO- 
These functions are used at runtime, after the divergent design has 
been materialized, to route queries to different specialized replicas. 
A solution to the BIP determines how to compute these functions 
for a training query q in Q, based on the variables t r q and tq . Here, 
we describe how to compute these functions for any query q' that 
is not part of the training workload. We focus on the computation 
of ho(q') but our techniques readily extend to the other functions. 

Our first approach is inspired by the original problem statement 
of the tuning problem |5| and computes ho(q') as the m replicas 
with the lowest evaluation cost for q'. Normally this requires N 
what-if optimizations for q', but we can leverage again fast what-if 
optimization in order to achieve the same result more efficiently. 
Specifically, we first compute TPlans(q') (which requires a few 
calls to the what-if optimizer) and then formulate a BIP that com- 
putes the top m replicas for q'. 

Our second approach tries to match more closely the revised 
problem statement, where a query is not necessarily routed to its 
top m replicas. Our approach is to match q' to its most "similar" 
query q in the training workload Q, and then to set ho(q') = ho(q). 
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module outputs the difference between the two as the performance 
improvement if (I alldi: ,li sl ' d ' : ) were materialized. This output, which 
is essentially a time series since (I sl " ic ,h slldl: ) is being continuously 
updated, can inform the DBA about the need to retune the system. 

Clearly, it is important for the online monitor to maintain (F' l(k , h slul 
up-to-date with the latest statements in the workload. For this pur- 
pose, the online monitor solves a bare-bones variant of DDT that 
assumes a = (i.e., no failures) and does not employ any con- 
straints except perhaps very basic ones (e.g., a space budget per 
replica). Beyond being fast to solve, this formulation also reflects 
the best-case potential to improve performance, which again can 
inform the DBA about the need to retune the system. RITA allows 
the DBA to impose additional constraints inside the online monitor 
at the expense of taking longer to update the output of the online 
monitor. 

5.2 Recommender 

The DBA invokes the recommender module to run a tuning ses- 
sion, for the puipose of tuning the initial divergent design or re- 
tuning the current design when the workload changes. The DBA 
provides an instance of the DDT problem, e.g., a training work- 
load, the parameter a and several constraints, and the recommender 
returns the corresponding (near-)optimal divergent design. The rec- 
ommender leverages the BIP-based formulation of DDT in order to 
compute its output efficiently. 

If desired by the DBA, the recommender can also return a set 
of possible designs that represent trade-off points within a multi- 
dimensional space. For example, suppose that the DBA specifies 
the workload-evaluation cost and the materialization cost of each 
design as the two dimensions of this space. We expect that a de- 
sign with a higher materialization cost will have more indexes, 
and hence will have a lower workload-evaluation cost. The rec- 
ommender formulates a BIP to compute an optimal divergent de- 
sign that does not bound the materialization cost. The solution 
provides an upper bound on materialization cost, henceforth de- 
noted as C m . Subsequently, the recommender formulates several 
tuning BIPs where each BIP puts a different threshold on the mate- 
rialization cost based on C m and some factor (e.g., materialization 
cost should not exceed .5 x C m ). The thresholds for these Pareto- 
optimal designs can be predefined or chosen based on more in- 
volved strategies such as the Chord algorithm [7|. An important 
point is that the successive BIPs are essentially identical except for 
the modified constraint on the materialization cost, which enables 
the BIP solver to work fast by reusing previous computations. 

The DBA can also add other parameters into this exploration. 
For example, adding the number of replicas as another parameter 
will cause the recommender to use the same process to generate de- 
signs for the hypothetical scenarios of expanding/shrinking the set 
of replicas. The final output can inform the DBA about the trade-off 
between workload-evaluation cost and design-materialization cost, 
and how it is affected by the number of replicas. 

Besides being able to perform tuning sessions efficiently, RITA's 
recommender module gains two important features through its re- 
liance on a BIP solver. 
• Fast refinement. As mentioned earlier, the BIP solver can reuse 

computation if the current BIP is sufficiently similar to previously 

solved BIPs. RITA takes advantage of this feature to offer fast 

refinement of the solution for small changes to the input. E.g., 

against the current design (F urr ,rr mr ) of the system, using the ExpTotaICost() t h e optimal divergent design can be updated very efficiently if 

metric of each design on the workload in the sliding window. The me DBA wishes to change the set of candidate indexes or impose 

7~ ; ~ ; ; " , „ r „ additional inter-replica constraints. 

Any vector-similarity metric will do. We first convert v q < v q to 

binary vectors indicating which indexes are used at each replica • Early termination. In the course of solving a BIP, the solver 

and then use a cosine-similarity metric. maintains the currently-best solution along with a bound on its 



Figure 3: The architecture of RITA. 



The intuition is that the two queries would affect the divergent de- 
sign similarly if they were both included in the training workload. 
We can use several ways to assess similarity, but we found that fast 
what-if optimizations provides again a nice solution. Specifically, 
we compute again TPlans(q') and then quickly find the optimal 
plan for q' in each replica. We then form a vector v q i where the i-th 
element is the set of indexes in the optimal plan of q' at replica ;'. 
We can compute a similar vector for v q and then compute the simi- 
larity between q' and q as the similarity between the corresponding 
vector^]. The intuition is that q' is similar to q if in each replica they 
use similar sets of indexes. We can refine this approach further by 
taking into account the top-2 plans for each query, but our empirical 
results suggest that the simple approach works quite well. 

5. rita: ARCHITECTURE AND FUNCTION- 
ALITY 

In this section we describe the architecture and the functionality 
of RITA, our proposed index-tuning advisor. RITA builds on the 
reduction presented in the previous section in order to offer a rich 
set of features. 

Figure|3]shows the architecture of RITA. It comprises two main 
modules: the online monitor, which continuously analyzes the 
workload in order to detect changes and opportunities for retun- 
ing; and the recommender, which is invoked by the DBA in order 
to run a tuning session. As we will see later, both modules solve 
a variant of the DDT problem in order to perform their function. 
Also, both modules make use of the reduction we presented in the 
previous section in order to solve the respective tuning problems. 
For this purpose, they employ an off-the-shelf BIP solver. The re- 
maining sections discuss the two modules in more detail. 

5.1 Online Monitor 

The online monitor maintains a divergent design (F^ji* 1 " 10 ) that 
is continuously re-computed based on the latest queries in the work- 
load. Concretely, the monitor maintains a sliding window over 
the current workload (the length of the window is a parameter de- 
fined by the DBA) and then solves DDT using the sliding win- 
dow as the training workload. Each new statement in the running 
workload causes an update of the window and a re-computation of 
(F ,i *,h s,i *). 

Once computed, the up-to-date design (F sliik h 51 " 1 ') ; s compared 



Parameter 



Values 



Number of replicas (N) 
Routing multiplicity (m) 
Space budget (b) 
Prob. of failure (a) 
Load skew (r) 
Percentage-update 
Sliding window (w) 



2, 3, 4, 5 

1,2,3 

0.25x,0.5x, l.Ox.INF 

0.0,0.1,0.2,0.3,0.4 

1.3, 1.5, 1.7, 1.9, 2.1, INF 

10- 5 , 10- 4 , 10" 3 , io- 2 

40, 60, 80, 100 



Table 1: Experimental parameters (default in bold). 



suboptimality. This information can be leveraged by RITA to 
support early termination based on time or quality. For instance, 
the DBA may instruct the recommender to return the first solution 
that is within 5% of the optimal, which can reduce substantially 
the total running time without compromising performance for the 
output divergent design. Or, the DBA may ask for the best solu- 
tion that can be computed within a specific time interval. 

6. EXPERIMENTAL STUDY 

This section presents the results of the experimental study that 
we conducted in order to evaluate the effectiveness of RITA. In 
what follows, we first discuss the experimental methodology and 
then present the findings of the experiments. 

6.1 Methodology 

Advisors. Our experiments use a prototype implementation of 
RITA written in Java. The prototype employs CPLEX vl2.3 as the 
off-the-shelf BIP solver, and a custom implementation of INUM 
for fast what-if optimization. The database system in our exper- 
iments is the freely available IBM DB2 Express-C. The CPLEX 
solver is tuned to return the first solution that is within 5% of the 
optimal. In all experiments, we use p RITA to denote the divergent 
design computed by RITA. 

We compare RITA against the heuristic advisor DlVGDESIGN 
that was introduced in the original study of divergent designs [ 5 1 . 
DlVGDESIGN employs IBM's physical design advisor internally. 
Similar to [5 1, we run DlVGDESIGN five times and output the lowest- 
cost design out of all the independent runs. We denote this final 
design as p DD . We note that the comparison against DlVGDESIGN 
concerns only a restricted definition of the general tuning problem, 
since DlVGDESIGN supports only a space budget constraint and 
does not take into account replica failures. 

We also include in the comparison the common practice of us- 
ing the same index configuration with each replica. The identical 
configuration is computed by invoking the DB2 index-tuning advi- 
sor on the whole workload. We use p UNIF to refer to the resulting 
design. 

Data Sets and Workloads. We use a 10GB TPC-DS database (B) 
for our experiments, along with three different workloads, namely 
TPCDS-query, TPCDS-mix and TPCDS-dyn. TPCDS-query com- 
prises 40 complex TPC-DS benchmark queries that are currently 
supported by our INUM implementation |16|. TPCDS-mix adds 
INSERT statements that model updates to the base data. TPCDS-dyn 
models a workload of 600 queries that goes through three phases, 
each phase corresponding to a specific distribution of the queries 
that appear in TPCDS-query. The first phase corresponds mostly to 
queries of low execution cosfl then the distribution is inverted for 
the second phase, and reverts back to the starting distribution in the 
first phase. 



The execution cost is measured with respect to the optimal index- 
set for each query returned by the DB2 advisor. 



In all cases, the weight for each query is set to one, whereas 
the update of each INSERT statement is determined as the product 
of the cardinality of the corresponding relation and a percentage- 
update parameter. This parameter allows us to simulate different 
volumes of updates when we test the advisors. 

Note that the size of the database does not affect the trends ob- 
served in our experiments, as all performance metrics are based on 
the DB2 optimizer's cost model (more on this later). 

Candidate Index Generation. Recall from Section[3]that the DDT 
problem assumes that a set of candidate indexes J? is provided as 
input. There are many methods for generating S^ based on the 
database and representative workload. In our setting, we use DB2's 
service to select the optimal indexes per query (without any space 
constraints) and then perform a union of the returned index- sets. 
The resulting index-set, which is optimal for the workload in the 
absence of constraints and update statements, contains 108 candi- 
date indexes and has a total size of 15GB. 

Experimental Parameters. Our experiments vary the following 
parameters: the number of replicas N, the per-replica space budget 
b, the probability of failure a, the load-skew factor r, the percent- 
age of updates in the workload (for TPCDS-mix), and the size of the 
sliding window w for online monitoring. The routing multiplicity 
factor (m) is always set to be \N/2] . Table Q] shows the parameter 
values tested in our experiments. Note that the storage space budget 
is measured as a multiple of the base data size, i.e., given TPCDS 
10 GB base data size, a space budget of 0.25 x indicates a 2.5 GB 
storage space budget. 

Metrics. We use ExpTotalCostQ to measure the performance of 
a divergent design. To allow meaningful comparisons among the 
designs generated by different advisors, we compute this metric for 
a specific design by invoking DB2's what-if optimizer for all the 
required cost factors. This methodology, which is consistent with 
previous studies on physical design tuning, allows us to gauge the 
effectiveness of the divergent design in isolation from any estima- 
tion errors in the optimizer's cost models. In some cases, we also 
report the performance improvement of /? RITA over p DD and Punif, 
where the performance improvement of a design X over a design 
Y is computed as 1 — ExpTotalCostiX) / ExpTotalCost(Y) . We also 
report the time that is taken to execute the index advisor for the 
corresponding divergent design. 

Testing Platform. All measurements are taken on a machine run- 
ning 64-bit Ubuntu OS with 1.83GHz eight-core processor and 
6GB RAM. 

6.2 Results 

Basic Tuning Problem. We first consider a basic case of DDT 
when q = and r = +°o, i.e., no failures occur and there is no con- 
straint on load skew. There is a single constraint on the divergent 
design which is the per-replica space budget. This setting corre- 
sponds essentially to the original problem statement in [5 1. 

We begin with a set of experiments that evaluates the perfor- 
mance of RITA and the competitor advisors on the query-only 
workload TPCDS-query. In this case indexes can only bring bene- 
fit to queries, and hence the only restraint in materializing indexes 
comes from any constraints. Figure|4]shows the performance of the 
divergent designs computed by RITA, DlVGDESIGN, and UNIF, 
as we vary the space budget parameter. (All other parameters are 
set to their default values according to Table [T]) The results show 
that RITA consistently outperforms the other two competitors for 
a wide range of space budgets. The improvement is up to 40% over 
p UNIF and up to 30% for p DD . Another way to view these results is 
that RITA can make much more effective usage of the aggregate 
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disk space for indexes. For instance, p RITA at b — 0.5 x matches 
the performance of p UNIF at b = 1.0 x, i.e., with double as much 
space for indexes. In all cases, RITA's better performance can be 
attributed to the fact that it searches a considerably larger space of 
possible designs, through the reduction to a BIP. As the space bud- 
get increases, the performance of p RnA , Pdd and Pum F converge as 
all beneficial indexes can be materialized in every design. 

We next examine the performance of RITA and the competitor 
advisors on a workload of queries and updates. Figure [5]reports the 
performance of p RITA , p DD and p UNIF for the workload TPCDS-mix, 
as we vary the number of replicas in the system. We chose this 
parameter as updates have to be routed to all replicas and hence 
it controls directly the total cost of updates. We observe that the 
improvement of RITA over UNIF is in the order of 50% and the 
improvement of RITA over DlVGDESIGN is 35%. Not surpris- 
ingly, the improvements increase with the number of replicas. The 
reason is that RITA is able to find designs with much fewer in- 
dexes per replica compared to Punif and p DD , which contributes to 
a lower update cost. For instance for N = 3 and b = 0.5 x, the 
number of indexes per replica of p RrrA is (50,47,34) compared to 
(88,88,88) for Punif and (67,66,65) for p DD . We conducted sim- 
ilar experiments with different weights for the update statements 
and observed similar trends. 

The next experiment examines how RITA's advanced function- 
ality can control even further the cost of updates. Instead of hav- 
ing RITA minimize the combined cost of queries and updates, we 
instruct the advisor to perform the following constrained optimiza- 
tion: minimize query cost such that update cost is at most x% of the 
update cost of a uniform design. Essentially, the desire is to make 
updates much faster compared to the uniform design, and also try 
to get some benefits for query processing. This changed optimiza- 
tion requires minimal changes to the underlying BIP: the objective 
function includes only the cost of evaluating queries, and the con- 
straints include an additional linear constraint on the total update 
cost based on the update cost of the uniform design (which can be 
treated as a constant). The ease by which we can support this ad- 
vanced functionality reflects the power of expressing DDT as a BIP. 

Figure [6] depicts the cost of the query workload under p RrrA as 
we vary the factor that bounds the update cost relative to pump. 
For comparison we also show the cost of the query workload for 
p UNIF . The results show clearly that the designs computed by RITA 
can improve performance dramatically even in this scenario. As a 
concrete data point, when the bounding factor is set to 0.4, p RITA 
makes query evaluation 33% cheaper compared to p UNIF and incurs 
an update cost that is less than half the update cost of Punif- 

Overall, our results demonstrate that RITA clearly outperforms 
its competitors on the basic definition of the divergent-design tun- 
ing problem. From this point onward, we will evaluate RITA's 



effectiveness with respect to the generalized version of the prob- 
lem (i.e., including failures and a richer set of constraints). In the 
interest of space, we present results with query-only workloads, as 
the trends were very similar when we experimented with mixed 
workloads. 

Factoring Failures. We first evaluate how well RITA can tailor 
the divergent design in order to account for possible failures, as 
captured by the failure probability a. 

Figure [PJl shows the ExpTotalCostQ metric for p RnA , Pdd and 
Pump as we vary the probability of failure a. There are two in- 
teresting take-away points from the results. The first is that p RITA 
has a relatively stable performance as we vary a. Essentially, we 
can reap the benefits of divergent designs even when there is an 
increased probability of failure in the system, as long as there is a 
judicious specialization for each replica and a controlled strategy to 
redistribute the workload (two things that RITA clearly achieves). 
The second interesting point is that the gap between p RITA and p DD 
increases with a. Basically, p DD ignores the possibility of failures 
(i.e., it always assumes that a = 0) and hence the computed design 
p DD cannot handle effectively a redistribution of the workload when 
a replica becomes unavailable. As a side note, the cost of p UNIF is 
unchanged for different values of a, since each query has the same 
cost under p UNIF on all replicas, and hence a redistribution of the 
workload does not change the total cost. 

Bounding Load Skew. We next study how RITA handles a (inter- 
replica) constraint on load skew. Recall that the constraint has the 
following form: for any two replicas, their load should not differ by 
a factor of more than 1 + r , where r > is the load-skew parameter. 
A balanced load distribution is important for good performance in 
a distributed system and hence we are interested in small values for 
r. The ability to satisfy such constraints is part of RITA's novel 
functionality. 

Figure[8]shows the performance of p RnA , Pdd and Punif as we vary 
parameter r that bounds the load skew (recall that r = implies no 
skew). We report two sets of results for RITA corresponding to 
a = (no failures) and a = 0.1 (10% chance that one replica will 
fail) respectively, in order to examine the interplay between a and 
r. Note that we report the results for the greedy version of RITA, 
which are identical to the exact solution of the constraint. The chart 
shows a single point corresponding to p DD , given that it is not pos- 
sible to constrain load skew within DlVGDESIGN. As shown, p DD 
has a significant load skew of up to a 2x difference between repli- 
cas. This magnitude of skew limits severely the ability of the sys- 
tem to maintain a balanced load and to route queries effectively. In 
contrast, RITA is able to compute designs that maintain a low ex- 
pected cost (up to 37% improvement compared to UNIF) and also 
satisfy the bound on load skew. These savings are not affected by 
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Table 2: The average running time of RITA (in seconds) 



the value of a-RITA is again able to make a judicious choice for 
the divergent design in order to satisfy all constraints and handle 
failures. Note that the uniform design trivially satisfies the load- 
skew constraint for all values of r as every replica has the same 
design and hence the system can be perfectly balanced. 

Running Time. Given an instance of the basic DDT problem (a = 
0, t = +°o), RITA spends 180 seconds to initialize INUM, a step 
that is dependent solely on the input workload, and then requires 
only four seconds to formulate and solve the resulting BIP. An im- 
portant point is that the initialization step can be reused for free if 
the workload remains unchanged, e.g., if the DBA runs several tun- 
ing sessions using the same workload but different constraints each 
time. Each subsequent tuning session can thus be executed in the 
order of a few seconds, offering an almost interactive response to 
the DBA. 

Table [2] shows the running time for RITA as we vary the load- 
skew factor and the probability of failure, two parameters that cor- 
respond to novel features of our generalized tuning problem. Note 
that the time to initialize INUM remains the same as before and is 
excluded from all the cells of the table. Clearly, the new features 
complicate the tuning problem and hence have an impact on run- 
ning time. Still, even for the most complex combination (r > 
and a > 0) RITA has a reasonable running time of at most three 
minutes. Moreover, as noted in Section [5] RITA can always be 
invoked with a time threshold and return the best design that has 
been identified within the allotted time. 

Routing. The next set of experiments examines the effectiveness 
of the routing scheme we introduced in Section 14.51 which deter- 
mines how to route unseen queries (i.e., queries not in W for which 
the routing functions hj cannot be applied) to "good" specialized 
replicas. 

Our test methodology splits TPCDS-query into two (sub)workloads: 
(1) a training workload that plays the role of W and consists of 30 
randomly-chosen queries of TPCDS-query, and (2) a testing work- 
load that plays the role of the unseen queries and consists of the 
remaining 10 queries. We compute a divergent design p RITA for the 
training workload, and route the queries in TPCDS-query (includ- 
ing both seen and unseen queries) assuming p RITA is deployed. For 
comparison, we apply the same methodology to the uniform de- 
sign: we first derive p UNIF for the training workload and then route 



the queries in TPCDS-query workload in round-robin fashion. We 
repeat this experiment for ten independent runs, where each run 
involves a different random split of the workload. 

Figure [9] shows the expected cost of the workload for p RITA and 
Punif for each run. The results show that RITA outperforms UNIF 
consistently, even though replica specialization does not take into 
account the unseen queries. The improvements vary across differ- 
ent runs depending on the choice of the workload split, but overall 
we can reap the benefits of divergent designs even with incomplete 
knowledge of the workload. 

Online Monitoring. The aforementioned routing scheme can help 
the system cope with unseen queries, but at some point it may be- 
come necessary to retime the divergent design if the actual work- 
load is substantially different than the training workload. The next 
experiment evaluates the online-monitoring module inside RITA 
which is designed for the task of detecting workload changes. 

We assume that the system receives the dynamic workload TPCDS-dyn, 
which shifts to a different query distribution after query 200 and 
then shifts back to the original distribution at query 400. Initially, 
the system is equipped with a divergent design p R j" A that is tuned 
with a training workload from the first query distribution. The 
monitoring module continuously computes a divergent design p R ^ A 
based on a sliding window of the last 60 queries in the workload, 
and outputs the improvement on ExpTotalCostQ if p^ A were used 
instead of p RI 'x A . 

FigurefTOlshows the monitoring statistics produced by the online- 
monitoring module of RITA for the TPCDS-dyn workload. Match- 
ing our intuition, the output shows that p R ^ A has near- zero improve- 
ments for the first 200 queries, since the current design p™^ A is al- 
ready tuned for the particular phase of the workload. However, as 
soon as the workload shifts to a different distribution, the output 
shows a considerable improvement of more than 60%. This can be 
viewed as a strong indication that a retuning of the system can yield 
significant performance improvements. The spike tapers off close 
to query 450, since in this experiment the workload shifts back to 
its previous distribution and hence there is no benefit to changing 
the current design. 

RITA requires 6.5 seconds on average to analyze each new query 
in this workload, and can thus generate an output that accurately re- 
flects the actual workload. We conducted similar experiments with 
different values of the length of the sliding window and observed 
similar results. For instance, RITA takes at most 8 seconds when 
the sliding window is set to 100 statements. 

Elastic Retuning. After observing the monitoring output, the DBA 
can invoke the recommender module to examine different recom- 
mendations for retuning the system in an elastic fashion. The next 
set of experiments evaluate how fast RITA can generate these rec- 
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ommendations and also their quality. 

We employ a scenario that builds on the previous experiment on 
online monitoring. Specifically, we assume that the DBA invokes 
the recommender using the sliding window of 60 queries that corre- 
sponds to the spike in FigureQI)] Moreover, the DBA specifies two 
dimensions of interest with respect to a new divergent design: the 
workload-evaluation cost and the cost of materializing the design. 
Also, the DBA wants to study the effect of shrinking and expanding 
the number of replicas. We assume that the DBA sets the probabil- 
ity of failure (a) to be in order to allow RITA to execute fast and 
generate the output in a timely fashion. After inspecting the output, 
the DBA may invoke another (more expensive) tuning session for 
a specific choice of replicas (or routing multiplicity factor) and re- 
configuration cost, and a non-zero a. Our results in Figure [T3~lshow 
that RITA can compute a divergent design that matches the same 
level of performance as the case for a = 0. 

Figures QT] shows the output of the recommender based on our 
testing scenario. Each point (x,y) on the chart corresponds to a di- 
vergent design that requires x cost units to materialize and whose 
ExpTotalCostQ is equal to y. The three curves labeled N — z, 
z £ {2,3,4}, represent divergent designs that employ z replicas. 
We assume that N — 3 is the current setting in the system, and 
hence N = 2 (resp. N = 4) represents dropping (resp. adding) a 
replica. The chart also shows the ExpTotalCostQ metric of the cur- 
rent design, for comparison. As shown, there are several options to 
significantly improve (by up to 2x) the performance of the current 
design. Moreover, the DBA obtains the following valuable infor- 
mation: there is a least materialization cost in order to get some 
improvement; designs that require more than 53 units of materi- 
alization cost offer diminishing returns for N = 3 and N = 4; and 
there is not much benefit to increasing the number of replicas, since 
N = 3 and N = 4 have virtually identical performance. Based on 
these data points, the DBA can make an informed decision about 
how to retune the divergent design in the system. RITA requires 
a total of 50 seconds to generate the points in the chart. Note that 
the recommender does not have to initialize INUM for the training 
workload, as this initialization has already been performed inside 
the monitoring module. This short computation time facilitates an 
exploratory approach to index tuning. 

We employ another scenario that is similar to the previous one 
except that we assume the DBA wants to study the effect of using 
different values for the routing multiplicity factor, while keeping 
the number of replicas unchanged. 

Figure [12] shows the output of the recommender based on the 
above testing scenario. The three curves labeled m = z, z £ { 1 , 2, 3 } , 



represent divergent designs that have the routing multiplicity factor 
z (We assume that m = 2 is the current setting in the system). We 
observe that designs that require more than 53 units of material- 
ization cost have the same performance when routing queries for 
m = 2 as when routing queries for m = 1. This result indicates 
that we can obtain designs with some flexibility in routing queries 
(i.e., m = 2) and with the same performance as designs that have 
the most specialization (i.e., m = 1). Similarly, designs that require 
more than 89 units of materialization cost can route queries with 
the same performance for m = 1, m — 2 or m = 3. RITA requires a 
total of 25 seconds to generate the points in the chart. 

7. CONCLUSION 

In this paper, we introduced RITA, a novel index tuning advi- 
sor for replicated databases, that provides DBAs with a powerful 
tool for divergent index tuning. The key technical contribution of 
RITA is a reduction of the problem to a compact binary integer pro- 
gram, which enables the efficient computation of a (near-)optimal 
divergent design using mature, off-the-shelf software for linear op- 
timization. Our experimental studies demonstrate that, compared to 
state-of-the-art solutions, RITA offers richer tuning functionality 
and is able to compute divergent designs that result in significantly 
better performance. 
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APPENDIX 

A. PROVING THEOREM 1 

We reduce the original problem studied in [5 ] to DDT by proving 
their equivalence when a = and C contains solely a space-budget 
constraint per replica. Since the original problem is NP-Hard, the 
same follows for DDT. The result in Lemma[T](See below) is the 
key to prove their equivalence. It is important to note from Sec- 
tion l3.3l that in the general setting of DDT, ho{q) might not corre- 
spond to the m replicas with the least evaluation cost for q. 

LEMMA 1. In the problem setting of DDT when a = and C 
contains solely a space-budget constraint per replica, ho(q) corre- 
sponds to the m replicas with the least evaluation cost for q. 

We prove Lemma \T\ using contradiction. Assume that for some 
query q, there exist two replicas r\ and ri such that r\ <E ho(q), ri ^ 
ho(q) and cost(q,I ri ) > cost(q,I ri ). We then derive another routing 
function h' that is similar to h except that h' is slightly modified as 
follows: h' (q) = h (q) U {r 2 } - {n}. Clearly, TotalCost(I,h) > 
TotalCost(l,h'). This contradicts to the requirement to minimize 
TotalCost(l, h) in the problem setting of DDT. 



B. PROVING THEOREM 2 

We prove the theorem in two steps. First, we show that every di- 
vergent design (I, h) corresponds to a value-assignment v for vari- 
ables in the BIP such that v satisfies the constraints (Lemma [2}- 
This property guarantees that the solution space of the BIP con- 
tains all possible solutions for the divergent design tuning problem. 
Subsequently, we prove that the optimal assignment v* corresponds 
to a divergent design. Combining these two results, we can then 
conclude the correctness of the theorem (Lemma[3}. 

To simplify the presentation and without loss of generality, we 
prove the theorem for the basic DDT when a = 0, C = and the 
workload comprises solely queries, i.e., W = Q. 

Given a valid- assignment v, we use BIPcost(\) to denote the 
value of the objective function of the BIP under the assignment v. 

LEMMA 2. For any divergent physical design (I,h), there is an 
assignment v s.t. TotalCost(l,h) = BIPcost(\). 

LEMMA 3. Let v* denote the solution to the BIP problem. Then, 
TotalCost(l,h) = BIPcost(\*), where (I,h) is the divergent design 
derived from v*. 

B.l Proof of Lemma H 

Given a divergent design (I, h) and for every query q £ Q, using 
the linear decomposability property, we can express the cost of q at 
replica r e ho(q) as: 



cost(q,I r ) = I3 P - 



ie[l,n],a=yfi] 



for some choice of p = p r £ TPlans(q) and Y = Y p ' r £ Atom(I r ). 
We assign the values for variables as follows. 

• v(f£) = lifre h (q), 

• v(y;;) = lif> = //, r€ho(q), 

• y ( x pa) = 1 if P = P r > reh (q) and a = Y p > r [t\,i 6 [l,n], 

• \( s r a ) = 1 if a e /,-, r e [1,N], and 

• The other cases of variables are assigned value 

We observe that under this assignment, all constraints in the BIP 
are satisfied. For instance, since v(f„) = 1 when r £ ho(q) and ho(q) 
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has m values, it can be immediately derived that X/refl N]*q = m > 
i.e., constraint ((2} is satisfied. 

By eliminating terms with value 0, we obtain the following re- 
sults. 



FTotalCost(I,h, j) = ^ ^ 



/(<?) 



qeQre[lM]r\r^j 



maxm,N — 1 



cost(q,rJ) 



BIPCostM = y y V J -^-cost(q, 



m 



t(q,rj)= J2 Wr+ E 

p£TPkms(q) p£TPlans(q) 

(ie^U{SCANj}U---U{SCAN„} 



such that: 



cost(q,r) = l3p+ Y^ Ipa, for r€h (q),p = p r ,Y = Y p > r £Atom(I r ) 

ie[l,n],a=r[!] 

Thus, BIPCost(\) = TotalCost(I,ti). 

B.2 Proof of Lemma |3] 

The following arguments are derived based on the assumption 
that v* satisfies the BIP formulation. 

First, based on l|2}, we derive that for every query q, there exists a 
set S q = {r | r £ [l,N]} and |5,| = m such that y*(t r q ) = 1 iff re S q . 

Second, based on ©, we derive that for every query q and every 
r £ Sq, there exists exactly one plan p = // £ TPlans(q) such that 

v*(yp = i. 

Third, based on l|6}, there exists an atomic configuration Y p ' r , 
r £ 5 9 , p = p' that corresponds to the assignments for \(x'~ ). 

Finally, we prove that p r and Y p ' r , r 6 5 ? , correspond to the 
choice of plan p and atomic configuration Y that yields the mini- 
mum value of cost (q, I r ), by using contradiction. Combining these 
results, we conclude that BIPCost(\*) = TotalCost(l,h). 

Suppose that there exists a different choice p c £ TPlans(q) and 
Y c £ Atom(I r ), r £ 5 ? , such that cost{q,p c ,Y C ) < cost(q,p r ,Y p ' r ). 
Here, we use cost(q,p,Y) denote the cost of q using the template 
plan p and the atomic configuration Y. 

We can now derive an alternative assignment v c that is similar to 
v* except the followings: 

• Variables corresponding to p r and Y pr are assigned value 0, and 

• y (yp) — 1 if P — P° ' r £ S q , and 

• y i x pa) = 1> if P = P°' r ^ Sq ar, d a = Y c [i], i £ [l,n]. 
We observe that v c is a valid constraint-assignment for the for- 
mulated BIP. However, since BIPcost(\ c ) < BlPcost(v*), this con- 
tradicts our assumption about the optimality of v*. 
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Figure 13: Augmented BIP to handle failures. 
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C. FACTORING FAILURES 

In this section, we present the full details of how RITA integrates 
failures into the BIP. 

Under our assumption of using fast what-if optimization, the cost 
of a query q in some replica r can be expressed as cost(q,I r ) = 
costijJ ,A') for some choice of p' £ TPlans(q) and an atomic con- 
figuration A' £ Atom(I r ) We introduce the following additional vari- 
ables. 



• t„ 



1 if and only if q is routed to replica r when j fails, i.e., 



hj(q) = {r\ty = l} 

• Xpa = 1 if and only if q is routed to replica r when j fails, p = p' 
and a £ A'. 

• y"p = 1 if and only if q is routed to replica r when j fails, p = //. 

We also need to add a new set of constraints, as given in Fig- 
ure [T3] These constraints are very similar to their counterparts in 
Figure [2] The correctness of the BIP is proven in the same way as 
presented in Appendix IB"1 



D. BOUNDING LOAD-SKEW 



Figure 14: Query-Optimal Constraints 



D. 1 Additional Constraints for Exact Solution 

This section presents the set of constraints that RITA formulates 
in order to ensure the optimality of cost(q, r) with the presence of 
bounding load-skew constraints. 

RITA introduces a new cost formula cost op '(q,r) = cost(q,I r ) 
forr£ [1,N]. The formula of cost op '(q,r) is very similar to co.rt(g,r): 



the variables yo r „ (resp. 



'pa 



) have the same meaning with y'„ 



(resp. x r ). The main difference is that for r ^ ho(q), we have 
cost(q,r) = whereas cost op '(q,r) = cost(q,I r ) > 0. The atomic 
constraint in d!6t are somehow similar to the atomic constraints on 
cost(q,r). Note that in J16aK the constraint requires exactly one 
template plan to be chosen to compute cost op '(q, r) in order for this 
value corresponds to the query execution cost of q on replica r. 

To establish the optimal cost constraints, we use the following al- 
ternative way to compute cost(q,X). For each internal plan cost /3 p , 
p £ TPlans(q), we first derive a "local" optimal cost, referred to as 
C° , which is the smallest cost that can be obtained by "plugging" 
all possible atomic configurations A £ AtomiX) into the slot of the 
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template plan of j3 p . Essentially, C\ 



local . 



■Pn+I 



local 



where / 



local 
P 



is the smallest value of the total access cost using some atomic- 
configuration A £ Atom(X) to plug into the template plan of /3 p . To 

rlocal 



obtain / ' 



we enumerate 



, for each slot in the internal plan of /3 p , 
all possible indexes in X that can be "plugged" into, and find the 
one that yields the smallest access cost to sum up into IS . Lastly, 
cost q (X) is then obtained as the smallest value among the derived 
C p ocal with p<ETPlans(q). 

The right hand-side of l !17t is the formula of C° . Here, we 
introduce variables u r ; where u r = 1 iff the index a is used at slot 
( in the template plan /3 p to compute C' oefl ' . For C l " cal to correspond 
to some atomic configuration, we impose the constraint in d!8at . 

Furthermore, an index a can be used in C p p if and only if a is 
recommended at replica r (constraint d!8bt ). 

The constraint d!9t ensures that the candidate index with the 
smallest access cost is selected to plug into each slot of f3 t in com- 



puting // 



local 



of I is 



l+P 



i-ZAf-n • ^y replacing the value of j3, we obtain the load- 
imbalance factor r. □ 

Note that this greedy scheme does not encounter the aforemen- 
tioned problem with cost(q, r) not to be equal to cost(q,I r ). Infor- 
mally, the reason is due to the fact that the right hand-side of the 
inequality constraint in ( 120) is a constant. 



D.2 Greedy Approach 

This section presents our proposal of a greedy scheme that trade- 
offs the quality of the design for the efficiency. 

First, we derive an optimal design (l opl ,h opl ) assuming there is 
no load imbalance constraint and the probability of failure is 0. We 



then compute an approximation factor /3 : 
following constraint into the BIP. 



-1 



1+(/V-1)t 



and add the 



load(I,h,r) < 



(l+P)TotalCost(I opr ,h opt ) 

N 



Vre[l,iV] (20) 



This constraint is an easy constraint, as its right handside is a 
constant. We prove that if the BIP solver can find a solution for 
the modified BIP, the returned solution is a valid solution and has 
TotalCost(l,h) bounded as the following theorem shows. 

THEOREM 3. The divergent design returned by the greedy so- 
lution satisfies all constraints in DDT problem and has 
TotalCost(I,h) < (l+/3)TotalCost(I opl ,h, 



opt) 



□ 



PROOF. We overload I opl (resp. I) to refer to the total cost of 



the design l opt (resp. I) as well. 

The maximum load of a replica in I is 



l+ff)4, 



(due to the con- 



straintl20t. By summing up the load of all replicas in I, we obtain: 
I < (1 + j3)I op t- Therefore, I differs from I op , by an approximation 
ratio (1 +/?). All remaining issue is to prove that I satisfies the 
load-imbalance constraint. 

Without loss of generality, assume that load(l.T) < load(j,T), 

V/e[2,JV] ; 

Since I is load-imbalance, we can derive the followings: 



■■ S~] load(j,l) + load{l,I) 

M2,N] 

(i+/?)V 



N 



> load(j, I) 



(N-l) 

N 



(l+f3)Io P t+load(l,i)>l 



I > lopt 



lo<ul[l.l)-_ ( 1-^-^(1 + O) )/, r / 



(21a) 

(21b) 

(21c) 
(21d) 

(21e) 



The maximum load in I is i ( 1 + a)I op , and the minimum load 



is 1- 



(Af-l 



■(1 +a) ) l op t- Therefore, the load-imbalance factor 
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